AITopics

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (0.87)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (0.88)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
(2 more...)

Neural Information Processing SystemsFeb-15-2026, 15:56:57 GMT

Differentiable and Stable Long-Range Tracking of Multiple Posterior Modes

But, they are limited to unimodal, Gaussian approximations of state uncertainty.

artificial intelligence, machine learning, particle, (20 more...)

Country:

North America > United States > New Jersey (0.04)
North America > United States > California > Orange County > Irvine (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.67)

arXiv.org Artificial IntelligenceNov-12-2025

Diffusion Guided Adversarial State Perturbations in Reinforcement Learning

Sun, Xiaolin, Liu, Feidi, Ding, Zhengming, Zheng, ZiZhan

Reinforcement learning (RL) systems, while achieving remarkable success across various domains, are vulnerable to adversarial attacks. This is especially a concern in vision-based environments where minor manipulations of high-dimensional image inputs can easily mislead the agent's behavior. To this end, various defenses have been proposed recently, with state-of-the-art approaches achieving robust performance even under large state perturbations. However, after closer investigation, we found that the effectiveness of the current defenses is due to a fundamental weakness of the existing $l_p$ norm-constrained attacks, which can barely alter the semantics of image input even under a relatively large perturbation budget. In this work, we propose SHIFT, a novel policy-agnostic diffusion-based state perturbation attack to go beyond this limitation. Our attack is able to generate perturbed states that are semantically different from the true states while remaining realistic and history-aligned to avoid detection. Evaluations show that our attack effectively breaks existing defenses, including the most sophisticated ones, significantly outperforming existing attacks while being more perceptually stealthy. The results highlight the vulnerability of RL agents to semantics-aware adversarial perturbations, indicating the importance of developing more robust policies.

diffusion model, machine learning, reinforcement learning, (19 more...)

2511.07701

Country: Asia (0.27)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (0.88)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Neural Information Processing SystemsOct-9-2025, 00:23:00 GMT

Differentiable and Stable Long-Range Tracking of Multiple Posterior Modes

But, they are limited to unimodal, Gaussian approximations of state uncertainty.

artificial intelligence, machine learning, particle, (20 more...)

Country:

North America > United States > New Jersey (0.04)
North America > United States > California > Orange County > Irvine (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.67)

Neural Information Processing SystemsSep-30-2025, 11:48:47 GMT

Online Learning of Dynamic Parameters in Social Networks

This paper addresses the problem of online learning in a dynamic setting. We consider a social network in which each individual observes a private signal about the underlying state of the world and communicates with her neighbors at each time period. Unlike many existing approaches, the underlying state is dynamic, and evolves according to a geometric random walk. We view the scenario as an optimization problem where agents aim to learn the true state while suffering the smallest possible loss. Based on the decomposition of the global loss function, we introduce two update mechanisms, each of which generates an estimate of the true state. We establish a tight bound on the rate of change of the underlying state, under which individuals can track the parameter with a bounded variance. Then, we characterize explicit expressions for the steady state mean-square deviation(MSD) of the estimates from the truth, per individual. We observe that only one of the estimators recovers the optimal MSD, which underscores the impact of the objective function decomposition on the learning quality. Finally, we provide an upper bound on the regret of the proposed methods, measured as an average of errors in estimating the parameter in a finite time.

dynamic parameter, name change, online learning, (4 more...)

Industry:

Information Technology > Services (0.66)
Education > Educational Setting > Online (0.66)

Technology: Information Technology > Artificial Intelligence (0.40)

arXiv.org Artificial IntelligenceAug-4-2025

Towards a Measure Theory of Semantic Information

Coghill, George M.

A classic account of the quantification of semantic information is that of Bar-Hiller and Carnap. Their account proposes an inverse relation between the informativeness of a statement and its probability. However, their approach assigns the maximum informativeness to a contradiction: which Floridi refers to as the Bar-Hillel-Carnap paradox. He developed a novel theory founded on a distance metric and parabolic relation, designed to remove this paradox. Unfortunately is approach does not succeed in that aim. In this paper I critique Floridi's theory of strongly semantic information on its own terms and show where it succeeds and fails. I then present a new approach based on the unit circle (a relation that has been the basis of theories from basic trigonometry to quantum theory). This is used, by analogy with von Neumann's quantum probability to construct a measure space for informativeness that meets all the requirements stipulated by Floridi and removes the paradox. In addition, while contradictions and tautologies have zero informativeness, it is found that messages which are contradictory to each other are equally informative. The utility of this is explained by means of an example.

artificial intelligence, information, natural language, (18 more...)

2508.00525

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.85)

arXiv.org Artificial IntelligenceMar-15-2025

Decentralized Hidden Markov Modeling with Equal Exit Probabilities

Sui, Dongyan, Zheng, Haitian, Leng, Siyang, Vlaski, Stefan

Social learning strategies enable agents to infer the underlying true state of nature in a distributed manner by receiving private environmental signals and exchanging beliefs with their neighbors. Previous studies have extensively focused on static environments, where the underlying true state remains unchanged over time. In this paper, we consider a dynamic setting where the true state evolves according to a Markov chain with equal exit probabilities. Based on this assumption, we present a social learning strategy for dynamic environments, termed Diffusion $\alpha$-HMM. By leveraging a simplified parameterization, we derive a nonlinear dynamical system that governs the evolution of the log-belief ratio over time. This formulation further reveals the relationship between the linearized form of Diffusion $\alpha$-HMM and Adaptive Social Learning, a well-established social learning strategy for dynamic environments. Furthermore, we analyze the convergence and fixed-point properties of a reference system, providing theoretical guarantees on the learning performance of the proposed algorithm in dynamic settings. Numerical experiments compare various distributed social learning strategies across different dynamic environments, demonstrating the impact of nonlinearity and parameterization on learning performance in a range of dynamic scenarios.

artificial intelligence, machine learning, probability, (17 more...)

2503.12153

Country:

Asia > China > Shanghai > Shanghai (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report (0.50)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Bhargav, Jayanth, Sundaram, Shreyas, Ghasemi, Mahsa

Robust Information Selection for Hypothesis Testing with Misclassification Penalties

arXiv.org Machine LearningFeb-20-2025

We study the problem of robust information selection for a Bayesian hypothesis testing / classification task, where the goal is to identify the true state of the world from a finite set of hypotheses based on observations from the selected information sources. We introduce a novel misclassification penalty framework, which enables non-uniform treatment of different misclassification events. Extending the classical subset selection framework, we study the problem of selecting a subset of sources that minimize the maximum penalty of misclassification under a limited budget, despite deletions or failures of a subset of the selected sources. We characterize the curvature properties of the objective function and propose an efficient greedy algorithm with performance guarantees. Next, we highlight certain limitations of optimizing for the maximum penalty metric and propose a submodular surrogate metric to guide the selection of the information set. We propose a greedy algorithm with near-optimality guarantees for optimizing the surrogate metric. Finally, we empirically demonstrate the performance of our proposed algorithms in several instances of the information set selection problem.

algorithm, information source, penalty, (16 more...)

arXiv.org Machine Learning

2502.14738

Country:

North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Artificial IntelligenceNov-19-2024

On Diffusion Models for Multi-Agent Partial Observability: Shared Attractors, Error Bounds, and Composite Flow

Wang, Tonghan, Dong, Heng, Jiang, Yanchen, Parkes, David C., Tambe, Milind

Multiagent systems grapple with partial observability (PO), and the decentralized POMDP (Dec-POMDP) model highlights the fundamental nature of this challenge. Whereas recent approaches to addressing PO have appealed to deep learning models, providing a rigorous understanding of how these models and their approximation errors affect agents' handling of PO and their interactions remain a challenge. In addressing this challenge, we investigate reconstructing global states from local action-observation histories in Dec-POMDPs using diffusion models. We first find that diffusion models conditioned on local history represent possible states as stable fixed points. In collectively observable (CO) Dec-POMDPs, individual diffusion models conditioned on agents' local histories share a unique fixed point corresponding to the global state, while in non-CO settings, the shared fixed points yield a distribution of possible states given joint history. We further find that, with deep learning approximation errors, fixed points can deviate from true states and the deviation is negatively correlated to the Jacobian rank. Inspired by this low-rank property, we bound the deviation by constructing a surrogate linear regression model that approximates the local behavior of diffusion models. With this bound, we propose a composite diffusion process iterating over agents with theoretical convergence guarantees to the true state.

agent, diffusion model, history, (15 more...)

2410.13953

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Middle East > Cyprus > Pafos > Paphos (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Thangeda, Pranay, Betz, Trevor S., Grussing, Michael N., Ornik, Melkior

InfraLib: Enabling Reinforcement Learning and Decision Making for Large Scale Infrastructure Management

arXiv.org Artificial IntelligenceSep-4-2024

Efficient management of infrastructure systems is crucial for economic stability, sustainability, and public safety. However, infrastructure management is challenging due to the vast scale of systems, stochastic deterioration of components, partial observability, and resource constraints. While data-driven approaches like reinforcement learning (RL) offer a promising avenue for optimizing management policies, their application to infrastructure has been limited by the lack of suitable simulation environments. We introduce InfraLib, a comprehensive framework for modeling and analyzing infrastructure management problems. InfraLib employs a hierarchical, stochastic approach to realistically model infrastructure systems and their deterioration. It supports practical functionality such as modeling component unavailability, cyclical budgets, and catastrophic failures. To facilitate research, InfraLib provides tools for expert data collection, simulation-driven analysis, and visualization. We demonstrate InfraLib's capabilities through case studies on a real-world road network and a synthetic benchmark with 100,000 components.

infralib, infrastructure management, infrastructure system, (14 more...)

2409.03167

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)

Genre: Research Report (0.51)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (0.89)
Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.70)